Word Buffering Models for Improved Speech Repair Parsing
Abstract
This paper describes a time-series model for parsing transcribed speech containing disfluencies. This model differs from previous parsers in its explicit modeling of a buffer of recent words, which allows it to recognize repairs more easily due to the frequent overlap in words between errors and their repairs. The parser implementing this model is evaluated on the standard Switchboard transcribed speech parsing task for overall parsing accuracy and edited word detection.
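The overlap intuition can be illustrated with a toy sketch. This is not the paper's time-series parsing model; it is only a hypothetical heuristic, assuming a fixed-size buffer of recent words and exact string matching, that flags words which repeat something still in the buffer. The function name `overlap_candidates` and the parameter `buffer_size` are illustrative assumptions and do not come from the paper.

```python
from collections import deque


def overlap_candidates(words, buffer_size=4):
    """Flag words that repeat a word held in a buffer of recent words.

    Toy heuristic (an assumption for illustration, not the paper's model).
    Returns a list of (position, word, distance_back) tuples, where
    distance_back is how many words earlier the nearest match occurred.
    """
    buffer = deque(maxlen=buffer_size)  # oldest word on the left, newest on the right
    hits = []
    for position, word in enumerate(words):
        # Scan the buffer from the most recent word back to the oldest.
        for distance, earlier in enumerate(reversed(buffer), start=1):
            if earlier == word:
                hits.append((position, word, distance))
                break
        # The deque drops its oldest word automatically once it is full.
        buffer.append(word)
    return hits


utterance = "i want a flight to boston uh to denver".split()
print(overlap_candidates(utterance))  # [(7, 'to', 3)]
```

In this made-up utterance, the second "to" begins the repair ("to denver") of the error "to boston", and it is found because the first "to" is still within the four-word buffer. With `buffer_size=2` the match would be missed, which suggests why the size of the buffer matters for a model of this kind.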
Similar resources
An improved joint model: POS tagging and dependency parsing
Dependency parsing is a form of syntactic parsing that automatically analyzes the dependency structure of natural-language sentences, producing a dependency graph for each input sentence. Part-Of-Speech (POS) tagging is a prerequisite for dependency parsing. Generally, dependency parsers do the POS tagging task along with dependency parsing in a pipeline mode. Unfortunately, in pipel...
Combining semantic and syntactic structure for language modeling
Structured language models for speech recognition have been shown to remedy the weaknesses of n-gram models. All current structured language models, however, are limited in that they do not take into account dependencies between non-headwords. We show that non-headword dependencies contribute significantly to improved word error rate, and that a data-oriented parsing model trained on semantica...
Structured language modeling
This paper presents an attempt at using the syntactic structure in natural language for improved language models for speech recognition. The structured language model merges techniques in automatic parsing and language modeling using an original probabilistic parameterization of a shift-reduce parser. A maximum likelihood re-estimation procedure belonging to the class of expectation-maximizatio...
Prosodic scoring of word hypotheses graphs
Prosodic boundary detection is important to disambiguate parsing, especially in spontaneous speech, where elliptic sentences occur frequently. Word graphs are an efficient interface between word recognition and parser. Prosodic classification of word chains has been published earlier. The adjustments necessary for applying these classification techniques to word graphs are discussed in this paper....
Improving parsing of spontaneous speech with the help of prosodic boundaries
Parsing can be improved in automatic speech understanding if prosodic boundary marking is taken into account, because syntactic boundaries are often marked by prosodic means. Because large databases are needed for the training of statistical models for prosodic boundaries, we developed a labeling scheme for syntactic-prosodic boundaries within the German Verbmobil project (automatic speech-to-s...